# Common Voice Fine-tuning
Disper Small Salam
Apache-2.0
An Arabic speech recognition model fine-tuned from OpenAI Whisper-small
Speech Recognition
Transformers Arabic

Duino
14
1
Whisper Medium Cv11 German Ct2
Apache-2.0
A German automatic speech recognition model based on OpenAI's whisper-medium, fine-tuned on the Common Voice 11.0 German dataset
Speech Recognition
Transformers German

mkenfenheuer
21
1
Whisper Tiny Chinese
Apache-2.0
A Chinese speech recognition model based on OpenAI Whisper Tiny, fine-tuned on the Common Voice 11.0 Chinese dataset
Speech Recognition
Transformers Chinese

jethrowang
99
1
Whisper Small Turkish V2
Apache-2.0
A Turkish speech recognition model based on OpenAI Whisper-small, fine-tuned on the Turkish Common Voice dataset
Speech Recognition
Transformers Other

atakanince
61
2
Speecht5 Finetuned Common Voice Be
MIT
A Belarusian text-to-speech model based on Microsoft's SpeechT5 architecture, fine-tuned on the Common Voice dataset
Speech Synthesis
Transformers Other

KoRiF
27
0
Speecht5 Tts Common Voice Uk
MIT
A Ukrainian text-to-speech model based on Microsoft's SpeechT5 architecture, fine-tuned on the Common Voice dataset
Speech Synthesis
Transformers Other

ewigeki
47
3
Whisper Large V2 Serbian
Apache-2.0
A Serbian speech recognition model based on OpenAI Whisper Large-V2, fine-tuned on the Serbian Common Voice 11.0 dataset, achieving a word error rate (WER) of 10.76%.
Speech Recognition
Transformers Other

DrishtiSharma
39
3
Whisper Large V2 Hindi 2.5k Steps
Apache-2.0
A Hindi automatic speech recognition (ASR) model based on OpenAI Whisper Large V2, fine-tuned on the Common Voice 11.0 dataset, achieving a word error rate (WER) of 10.05%.
Speech Recognition
Transformers Other

DrishtiSharma
52
2
Whisper Large V2 Hi V3
Apache-2.0
A Hindi speech recognition model fine-tuned from OpenAI Whisper Large-v2, achieving a word error rate (WER) of 11.3% on the Common Voice 11.0 Hindi test set
Speech Recognition
Transformers Other

anuragshas
21
1
Whisper Medium French
Apache-2.0
A French speech recognition model based on openai/whisper-medium, fine-tuned on the common_voice_11_0 dataset, achieving a normalized WER of 11.1406 and outperforming the original model.
Speech Recognition
Transformers French

pierreguillou
260
9
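Several of the entries above quote a word error rate (WER). As a rough illustration of what that number measures, the sketch below computes WER as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for this example; the listed models may have been scored with different text normalization, so their figures will not necessarily be reproduced by this exact code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Hypothetical helper for illustration: tokens are split on whitespace
    and compared exactly, with no case folding or punctuation stripping.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words with nothing costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (sat -> sit) and one deletion (the) over 6 reference words.
    print(f"{word_error_rate('the cat sat on the mat', 'the cat sit on mat'):.2%}")
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.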
Exp W2v2t Sv Se R Wav2vec2 S418
Apache-2.0
A Swedish automatic speech recognition model fine-tuned from facebook/wav2vec2-large-robust; it accepts audio input sampled at 16 kHz.
Speech Recognition
Transformers

jonatasgrosman
32
0
Exp W2v2t Fr Xls R S250
Apache-2.0
A French automatic speech recognition model based on facebook/wav2vec2-xls-r-300m, fine-tuned on the Common Voice 7.0 French dataset
Speech Recognition
Transformers French

jonatasgrosman
20
0
Exp W2v2t Ja Vp It S544
Apache-2.0
A Japanese automatic speech recognition model based on facebook/wav2vec2-large-it-voxpopuli, fine-tuned on the Japanese training set of Common Voice 7.0.
Speech Recognition
Transformers Japanese

jonatasgrosman
18
0
Exp W2v2t Ja Unispeech Sat S884
Apache-2.0
A Japanese automatic speech recognition model based on microsoft/unispeech-sat-large, fine-tuned on the Common Voice 7.0 Japanese dataset.
Speech Recognition
Transformers Japanese

jonatasgrosman
19
0
Exp W2v2t Ja Wavlm S729
Apache-2.0
A Japanese automatic speech recognition model based on microsoft/wavlm-large, fine-tuned on the Common Voice 7.0 Japanese dataset
Speech Recognition
Transformers Japanese

jonatasgrosman
15
2
Exp W2v2t Ja Unispeech S569
Apache-2.0
A Japanese automatic speech recognition model based on microsoft/unispeech-large-1500h-cv, fine-tuned on the Common Voice 7.0 Japanese dataset
Speech Recognition
Transformers Japanese

jonatasgrosman
14
0
Exp W2v2t En Unispeech Sat S459
Apache-2.0
An English speech recognition model fine-tuned from Microsoft's UniSpeech-SAT-Large; it accepts audio input sampled at 16 kHz.
Speech Recognition
Transformers English

jonatasgrosman
22
0
Wav2vec2 Large Xlsr 53 German Cv9
Apache-2.0
A German automatic speech recognition (ASR) model based on Facebook's wav2vec2-large-xlsr-53, fine-tuned on the German Common Voice 9.0 dataset.
Speech Recognition
Transformers German

oliverguhr
98
1
Wav2vec2 Large Xls R 300m Turkish Colab
Apache-2.0
A Turkish speech recognition model based on facebook/wav2vec2-xls-r-300m, fine-tuned on the Common Voice Turkish dataset.
Speech Recognition
Transformers

vai6hav
23
0
Wav2vec2 Large Xls R 300m Hindi Home Colab 11
Apache-2.0
A Hindi speech recognition model based on facebook/wav2vec2-xls-r-300m, fine-tuned on the Common Voice dataset
Speech Recognition
Transformers

nimrah
22
0
Wav2vec2 Large Xls R 300m Ia
Apache-2.0
An Interlingua automatic speech recognition model based on facebook/wav2vec2-xls-r-300m, fine-tuned on the Common Voice 8.0 Interlingua dataset
Speech Recognition
Transformers Other

ayameRushia
23
0
Wav2vec2 Large Xlsr 53 Ir
Apache-2.0
An Irish Gaelic automatic speech recognition model fine-tuned from wav2vec2-large-xlsr-53 on the Common Voice 7.0 dataset
Speech Recognition
Transformers

jcmc
24
0
Wav2vec2 Xls R 1b De Cv8
Apache-2.0
A German automatic speech recognition model based on facebook/wav2vec2-xls-r-1b, fine-tuned on the Common Voice 8 German dataset
Speech Recognition
Transformers German

jsnfly
22
0
Wav2vec2 Large Xlsr Eo
Apache-2.0
An Esperanto speech recognition model based on facebook/wav2vec2-large-xlsr-53, fine-tuned on the Common Voice dataset.
Speech Recognition Other
gchhablani
23
1
Wav2vec2 Xlsr 300m German Truecase
A German speech recognition model based on Facebook's wav2vec2-xls-r-300m, fine-tuned on the Common Voice German dataset; its transcripts preserve letter case.
Speech Recognition
Transformers

abnerh
16
1
English Model
An English speech recognition model based on facebook/wav2vec2-large, fine-tuned on the Common Voice dataset; it accepts audio input sampled at 16 kHz.
Speech Recognition
Transformers

tanmayplanet32
30
0
Wav2vec2 Large Xlsr 53 Hk
Apache-2.0
A Cantonese speech recognition model based on facebook/wav2vec2-large-xlsr-53, fine-tuned on the Common Voice dataset
Speech Recognition
Transformers

voidful
26
2